{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Ensemble Methods\n",
    "### Agenda\n",
    "\n",
    "<hr>\n",
    "1. Introduction to Ensemble Methods\n",
    "2. RandomForest\n",
    "3. AdaBoost\n",
    "4. GradientBoostingTree\n",
    "5. VotingClassifier\n",
    "\n",
    "<hr>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Introduction to Ensemble Methods\n",
    "* The objective of ensemble methods is to combine the predictions of several base estimators (Linear Regression, Decision Tree, etc.) to produce a combined, more generalized model.\n",
    "* Two types of ensemble methods\n",
    "  - Averaging methods: build several estimators independently & average their predictions. Example: RandomForest.\n",
    "  - Boosting methods: base estimators are built sequentially on re-weighted versions of the data, i.e. each new model focuses on the samples that earlier models misclassified. Example: AdaBoost.\n",
    "  \n",
    "<img src=\"https://cdn-images-1.medium.com/max/1000/1*PaXJ8HCYE9r2MgiZ32TQ2A.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. RandomForest\n",
    "* Recap: a limitation of a single decision tree is that it tends to overfit & shows high variance.\n",
    "* RandomForest is an averaging ensemble method whose prediction is a function of the predictions of 'n' decision trees.\n",
    "\n",
    "<img src=\"https://www.researchgate.net/profile/Stavros_Dimitriadis/publication/324517994/figure/fig1/AS:615965951799303@1523869135381/Classification-process-based-on-the-Random-Forest-algorithm-2.png\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Algorithm\n",
    "* The data consists of R rows & M features.\n",
    "* A bootstrap sample (rows drawn with replacement) of the training data is taken.\n",
    "* A random subset of the features is selected as split candidates.\n",
    "* The above two steps are repeated to build as many trees as configured.\n",
    "* For classification, the final prediction is the majority vote of the trees.\n",
    "* For regression, the final prediction is the mean (or median) of the individual tree predictions."
   ]
  },
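  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The steps above can be sketched by hand. This is a rough illustration under simplifying assumptions, not how `RandomForestClassifier` is implemented internally; `fit_simple_forest` and `predict_simple_forest` are hypothetical helper names, and 25 trees is an arbitrary choice. Each tree is trained on a bootstrap sample of the rows, considers a random subset of features at each split (`max_features='sqrt'`), and the forest predicts by majority vote."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_digits\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "\n",
    "\n",
    "def fit_simple_forest(X, y, n_trees=25, seed=0):\n",
    "    # Hypothetical helper: train n_trees trees, each on its own bootstrap sample\n",
    "    rng = np.random.RandomState(seed)\n",
    "    trees = []\n",
    "    for _ in range(n_trees):\n",
    "        rows = rng.randint(0, len(X), size=len(X))  # draw rows with replacement\n",
    "        tree = DecisionTreeClassifier(max_features='sqrt',\n",
    "                                      random_state=rng.randint(10**6))\n",
    "        trees.append(tree.fit(X[rows], y[rows]))\n",
    "    return trees\n",
    "\n",
    "\n",
    "def predict_simple_forest(trees, X):\n",
    "    # Majority vote over the class labels predicted by the individual trees\n",
    "    votes = np.stack([tree.predict(X) for tree in trees]).astype(int)\n",
    "    return np.array([np.bincount(sample_votes).argmax() for sample_votes in votes.T])\n",
    "\n",
    "\n",
    "demo_X, demo_y = load_digits(return_X_y=True)\n",
    "Xtr, Xte, ytr, yte = train_test_split(demo_X, demo_y, random_state=0)\n",
    "forest = fit_simple_forest(Xtr, ytr)\n",
    "forest_acc = np.mean(predict_simple_forest(forest, Xte) == yte)\n",
    "print('hand-rolled forest accuracy:', forest_acc)"
   ]
  },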
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Comparing Decision Tree & Random Forest on the scikit-learn digits data (8x8 images, MNIST-like)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import load_digits\n",
    "import numpy as np\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.model_selection import train_test_split"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "digits = load_digits()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = digits.data\n",
    "y = digits.target"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "trainX, testX, trainY, testY = train_test_split(X,y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "dt = DecisionTreeClassifier()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None,\n",
       "            max_features=None, max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
       "            splitter='best')"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dt.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8444444444444444"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "dt.score(testX,testY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "rf = RandomForestClassifier()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\ensemble\\forest.py:248: FutureWarning: The default value of n_estimators will change from 10 in version 0.20 to 100 in 0.22.\n",
      "  \"10 in version 0.20 to 100 in 0.22.\", FutureWarning)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
       "            max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=None,\n",
       "            oob_score=False, random_state=None, verbose=0,\n",
       "            warm_start=False)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rf.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9422222222222222"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rf.score(testX,testY)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Important Hyper-parameters\n",
    "* n_estimators : number of trees in the forest. More trees usually help, at a higher compute cost.\n",
    "* max_features : maximum number of features considered when splitting a node. The usual default is sqrt(n_features) for classification and n_features for regression.\n",
    "* n_jobs : set to -1 to use all CPU cores."
   ]
  },
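  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a brief illustration, the parameters above can be set explicitly. The values used here (`n_estimators=100`, `random_state=0`) and the `hp_`-prefixed variable names are illustrative assumptions, not tuned recommendations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import load_digits\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "hp_X, hp_y = load_digits(return_X_y=True)\n",
    "hp_trainX, hp_testX, hp_trainY, hp_testY = train_test_split(hp_X, hp_y, random_state=0)\n",
    "\n",
    "rf_tuned = RandomForestClassifier(\n",
    "    n_estimators=100,     # more trees: steadier predictions, longer training\n",
    "    max_features='sqrt',  # number of features considered at each split\n",
    "    n_jobs=-1,            # use all available CPU cores\n",
    "    random_state=0,       # make the run reproducible\n",
    ")\n",
    "rf_tuned.fit(hp_trainX, hp_trainY)\n",
    "hp_score = rf_tuned.score(hp_testX, hp_testY)\n",
    "print(hp_score)"
   ]
  },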
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Advantages\n",
    "* Requires little data preparation (for example, no feature scaling); depending on the scikit-learn version, missing values may still need to be imputed first.\n",
    "* Works well with high dimensional datasets.\n",
    "* Reduces variance compared with a single high-variance decision tree.\n",
    "* RandomForest can report feature importances. We can find the important features & use them in model training."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.        , 0.00153452, 0.0344571 , 0.01067713, 0.01082384,\n",
       "       0.0112286 , 0.01315547, 0.00079271, 0.        , 0.00636646,\n",
       "       0.02929224, 0.00748655, 0.01034612, 0.02012342, 0.00530885,\n",
       "       0.00044021, 0.        , 0.00869604, 0.0159474 , 0.02412374,\n",
       "       0.01963923, 0.05220618, 0.00815159, 0.00014153, 0.        ,\n",
       "       0.01481827, 0.03841265, 0.02457494, 0.0387679 , 0.01440344,\n",
       "       0.04183173, 0.00012386, 0.        , 0.04833531, 0.02064672,\n",
       "       0.01335855, 0.03460345, 0.02480548, 0.04399556, 0.        ,\n",
       "       0.        , 0.013969  , 0.02832915, 0.04130284, 0.02356978,\n",
       "       0.0158596 , 0.01057897, 0.        , 0.        , 0.00519296,\n",
       "       0.01254092, 0.02459812, 0.01702702, 0.02288651, 0.0375919 ,\n",
       "       0.00183362, 0.        , 0.00197704, 0.0181349 , 0.01630303,\n",
       "       0.02424614, 0.01287656, 0.01998931, 0.00157587])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rf.feature_importances_"
   ]
  },
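  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The raw array is hard to read, so one option is to rank the features. A minimal sketch, assuming a freshly fitted forest on the same digits data (`rf_imp` and the other names are introduced here for illustration). The importances sum to 1, and the 64 features map onto the 8x8 pixel grid."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_digits\n",
    "from sklearn.ensemble import RandomForestClassifier\n",
    "\n",
    "imp_X, imp_y = load_digits(return_X_y=True)\n",
    "rf_imp = RandomForestClassifier(n_estimators=50, random_state=0).fit(imp_X, imp_y)\n",
    "\n",
    "importances = rf_imp.feature_importances_\n",
    "top10 = np.argsort(importances)[::-1][:10]  # indices of the 10 most important pixels\n",
    "for idx in top10:\n",
    "    row, col = divmod(int(idx), 8)          # position in the 8x8 image\n",
    "    print('pixel (%d, %d): %.4f' % (row, col, importances[idx]))"
   ]
  },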
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3. AdaBoost\n",
    "* Boosting in general is about building a model from the training data, then creating a second model that attempts to correct the errors of the first. Models are added until the training set is predicted perfectly or a maximum number of models is reached.\n",
    "* AdaBoost was one of the first practical boosting algorithms.\n",
    "* AdaBoost can be used for both classification & regression.\n",
    "\n",
    "##### Algorithm\n",
    "* The core concept of AdaBoost is to fit weak learners (such as shallow decision trees) sequentially on repeatedly re-weighted versions of the data.\n",
    "* Initially, every sample is assigned an equal weight.\n",
    "* A base estimator is fitted to the weighted data.\n",
    "* Weights of misclassified samples are increased & weights of correctly classified samples are decreased.\n",
    "* The above two steps are repeated until all samples are correctly classified or the configured maximum number of iterations is reached.\n",
    "* Making predictions: the predictions of all the estimators are combined through a weighted majority vote (or sum) to produce the final prediction."
   ]
  },
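  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The re-weighting loop above can be sketched for the simplest case: binary, discrete AdaBoost with labels -1 / +1 and depth-1 trees (stumps). This is a simplified illustration and differs from the multi-class variant that `AdaBoostClassifier` uses. The helper names `fit_adaboost` and `predict_adaboost`, the 30 rounds, and the toy task (is the digit a 0?) are assumptions made for this example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.datasets import load_digits\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "\n",
    "\n",
    "def fit_adaboost(X, y, n_rounds=30):\n",
    "    # y must contain the labels -1 / +1\n",
    "    w = np.full(len(X), 1.0 / len(X))      # start with equal sample weights\n",
    "    stumps, alphas = [], []\n",
    "    for _ in range(n_rounds):\n",
    "        stump = DecisionTreeClassifier(max_depth=1, random_state=0)\n",
    "        stump.fit(X, y, sample_weight=w)\n",
    "        pred = stump.predict(X)\n",
    "        err = np.sum(w[pred != y])         # weighted error of this round\n",
    "        if err >= 0.5:                     # no better than chance: stop\n",
    "            break\n",
    "        err = max(err, 1e-10)              # guard against division by zero\n",
    "        alpha = 0.5 * np.log((1 - err) / err)\n",
    "        w = w * np.exp(-alpha * y * pred)  # raise weights of mistakes, lower the rest\n",
    "        w = w / w.sum()                    # keep the weights a distribution\n",
    "        stumps.append(stump)\n",
    "        alphas.append(alpha)\n",
    "    return stumps, alphas\n",
    "\n",
    "\n",
    "def predict_adaboost(stumps, alphas, X):\n",
    "    # Weighted vote: sign of the alpha-weighted sum of the stump predictions\n",
    "    total = sum(a * s.predict(X) for s, a in zip(stumps, alphas))\n",
    "    return np.where(total >= 0, 1, -1)\n",
    "\n",
    "\n",
    "digits_X, digits_y = load_digits(return_X_y=True)\n",
    "bin_y = np.where(digits_y == 0, 1, -1)     # toy task: is the digit a 0?\n",
    "stumps, alphas = fit_adaboost(digits_X, bin_y)\n",
    "ada_acc = np.mean(predict_adaboost(stumps, alphas, digits_X) == bin_y)\n",
    "print('rounds used:', len(stumps), 'training accuracy:', ada_acc)"
   ]
  },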
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.ensemble import AdaBoostClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [],
   "source": [
    "ab = AdaBoostClassifier(base_estimator=DecisionTreeClassifier(max_depth=8),n_estimators=600)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AdaBoostClassifier(algorithm='SAMME.R',\n",
       "          base_estimator=DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=8,\n",
       "            max_features=None, max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_weight_fraction_leaf=0.0, presort=False, random_state=None,\n",
       "            splitter='best'),\n",
       "          learning_rate=1.0, n_estimators=600, random_state=None)"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ab.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9822222222222222"
      ]
     },
     "execution_count": 45,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ab.score(testX,testY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [],
   "source": [
    "ab = AdaBoostClassifier(base_estimator=RandomForestClassifier(n_estimators=20),n_estimators=600)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "AdaBoostClassifier(algorithm='SAMME.R',\n",
       "          base_estimator=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
       "            max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_weight_fraction_leaf=0.0, n_estimators=20, n_jobs=None,\n",
       "            oob_score=False, random_state=None, verbose=0,\n",
       "            warm_start=False),\n",
       "          learning_rate=1.0, n_estimators=600, random_state=None)"
      ]
     },
     "execution_count": 49,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ab.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9644444444444444"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ab.score(testX,testY)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 4. GradientBoostingTree\n",
    "* A machine learning technique for regression and classification problems, which produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees.\n",
    "* A basic property of least-squares linear regression (with an intercept) is that its residuals sum to 0.\n",
    "* These residuals can be seen as the mistakes committed by our predictor model.\n",
    "* Tree-based models do not come with such a guarantee, but if the sum of residuals is not 0, then most probably there is some pattern in the residuals which can be leveraged to make the model better.\n",
    "* So, the intuition behind gradient boosting is to exploit the pattern in the residuals to strengthen a weak prediction model, repeating until the residuals no longer show any pattern.\n",
    "* Algorithmically, each stage fits a new model to the negative gradient of the loss (for squared error, exactly the residuals), so the loss is minimized step by step; ideally we stop when the test loss reaches its minimum."
   ]
  },
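  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The residual-fitting idea can be sketched by hand for squared-error loss. This is a simplified illustration on synthetic data (a noisy sine curve, an assumption made for this example), not the internals of `GradientBoostingRegressor`. The learning rate of 0.1, depth-3 trees and 100 stages are illustrative values, and all variable names are introduced here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.tree import DecisionTreeRegressor\n",
    "\n",
    "rng = np.random.RandomState(0)\n",
    "gb_X = np.sort(rng.uniform(0, 6, size=(200, 1)), axis=0)    # synthetic 1-D inputs\n",
    "gb_y = np.sin(gb_X).ravel() + rng.normal(0, 0.1, size=200)  # noisy sine target\n",
    "\n",
    "learning_rate = 0.1\n",
    "prediction = np.full(200, gb_y.mean())  # stage 0: predict the mean everywhere\n",
    "losses = []\n",
    "for _ in range(100):\n",
    "    residuals = gb_y - prediction       # negative gradient of the squared-error loss\n",
    "    tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(gb_X, residuals)\n",
    "    prediction += learning_rate * tree.predict(gb_X)  # small step towards the residuals\n",
    "    losses.append(np.mean((gb_y - prediction) ** 2))\n",
    "\n",
    "print('training MSE after 1 stage: %.4f, after 100 stages: %.4f' % (losses[0], losses[-1]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The training loss in this sketch can only go down from stage to stage; the test loss is the one to watch for overfitting."
   ]
  },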
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Problem : House Price Prediction using GradientBoostingTree"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import load_boston\n",
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "house_data = load_boston()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = house_data.data\n",
    "y = house_data.target"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.ensemble import GradientBoostingRegressor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [],
   "source": [
    "gbt = GradientBoostingRegressor()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,\n",
       "             learning_rate=0.1, loss='ls', max_depth=3, max_features=None,\n",
       "             max_leaf_nodes=None, min_impurity_decrease=0.0,\n",
       "             min_impurity_split=None, min_samples_leaf=1,\n",
       "             min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
       "             n_estimators=100, n_iter_no_change=None, presort='auto',\n",
       "             random_state=None, subsample=1.0, tol=0.0001,\n",
       "             validation_fraction=0.1, verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gbt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.model_selection import train_test_split"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "trainX, testX, trainY, testY = train_test_split(X,y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,\n",
       "             learning_rate=0.1, loss='ls', max_depth=3, max_features=None,\n",
       "             max_leaf_nodes=None, min_impurity_decrease=0.0,\n",
       "             min_impurity_split=None, min_samples_leaf=1,\n",
       "             min_samples_split=2, min_weight_fraction_leaf=0.0,\n",
       "             n_estimators=100, n_iter_no_change=None, presort='auto',\n",
       "             random_state=None, subsample=1.0, tol=0.0001,\n",
       "             validation_fraction=0.1, verbose=0, warm_start=False)"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gbt.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Track the test-set loss after each boosting stage\n",
    "test_score = np.zeros(gbt.n_estimators, dtype=np.float64)\n",
    "for i, y_pred in enumerate(gbt.staged_predict(testX)):\n",
    "    test_score[i] = gbt.loss_(testY, y_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Text(0,0.5,'Least squares Loss')"
      ]
     },
     "execution_count": 72,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEKCAYAAAAfGVI8AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvNQv5yAAAIABJREFUeJzt3Xl0XeV57/HvcwbN82TkQbbBBseE2mBDmEKBTIQM0CSQ5JKU5tLSIWmmNqnb3nvbZK27LmnTlPS2pWWFEDKRJpAyNUAcB8JNQgg22AaMwWAwni0PsmTNOue5f+wtI4hkHcva2tLZv89aZx3trTM82wfOT+/77v2+5u6IiEhypeIuQERE4qUgEBFJOAWBiEjCKQhERBJOQSAiknAKAhGRhFMQiIgknIJARCThFAQiIgmXibuAQjQ1NfmCBQviLkNEZEZZt27dfndvHu9xMyIIFixYwNq1a+MuQ0RkRjGzbYU8Tl1DIiIJpyAQEUk4BYGISMIpCEREEk5BICKScAoCEZGEUxCIiCRcUQfBXU/u5Nu/Kug0WhGRxCrqIPivp3YrCERExlHUQTCrppS9nX1xlyEiMq0VdxBUl3GoZ5D+oVzcpYiITFtFHQQtNaUA7Ovsj7kSEZHpq8iDoAyAfV0KAhGRsRR1EMyqDoNA4wQiImMq6iAY7hrSgLGIyNiKOggaKkrIpExdQyIixxBpEJhZnZndYWabzexZMzvPzBrMbLWZbQnv66N6/1TKaKkuZa8Gi0VExhR1i+CrwAPuvgRYBjwLrALWuPtiYE24HZmWmjL2dalrSERkLJEFgZnVABcBtwC4+4C7dwBXALeFD7sNuDKqGoCwRaAgEBEZS5QtgpOBduBWM3vSzL5mZpXALHffDRDet4z2ZDO73szWmtna9vb2CRcxq6ZMYwQiIscQZRBkgLOAm9z9TKCb4+gGcveb3X2lu69sbm6ecBGzakrp6Bmkb1BXF4uIjCbKINgB7HD3x8LtOwiCYa+ZtQKE9/sirIGW8FqCdrUKRERGFVkQuPseYLuZnRbueguwCbgHuDbcdy1wd1Q1wIhpJjRgLCIyqkzEr/+nwHfMrATYCnyMIHy+b2bXAa8AV0VZwKxwmgmdQioiMrpIg8Dd1wMrR/nVW6J835FeDQK1CERERlPUVxYD1FdkyaZNLQIRkTEUfRCYGS3VuqhMRGQsRR8EEAwYa00CEZHRJSMIdHWxiMiYEhEEurpYRGRsiQmCw726ulhEZDSJCIKWaq1dLCIylmQEwfC1BDpzSETkNyQiCGbVqEUgIjKWZARBta4uFhEZSyKCoK4iS0k6pa4hEZFRJCIIzIzm6lLa1TUkIvIbEhEEEIwTqEUgIvKbEhQEZew5rCAQEXm9xAWBu8ddiojItJKYIJhdV0b3QI7OvqG4SxERmVYSFATlAOw+3BtzJSIi00tigqC1NgyCDo0TiIiMlJggmF0XXFS2s0MtAhGRkRITBC3VZaRTpq4hEZHXSUwQpFPGSTVl6hoSEXmdxAQBQGttGbvUIhAReY1kBUFdObvUIhAReY1EBcHsuuCisnxeF5WJiAxLVhDUljOQy3OgeyDuUkREpo1Ig8DMXjazp8xsvZmtDfc1mNlqM9sS3tdHWcNIrbXBKaS7dAqpiMhRU9EiuMTdl7v7ynB7FbDG3RcDa8LtKaGri0VEflMcXUNXALeFP98GXDlVbzwcBBowFhF5VdRB4MCPzWydmV0f7pvl7rsBwvuWiGs4qr4iS2kmpRaBiMgImYhf/wJ332VmLcBqM9tc6BPD4LgeoK2tbVKKMTNm6xRSEZHXiLRF4O67wvt9wH8C5wB7zawVILzfN8Zzb3b3le6+srm5edJqml2ni8pEREaKLAjMrNLMqod/Bt4OPA3cA1wbPuxa4O6oahhNa225ppkQERkhyq6hWcB/mtnw+3zX3R8ws8eB75vZdcArwFUR1vAb
ZteWsberj8Fcnmw6UZdRiIiMKrIgcPetwLJR9h8A3hLV+45ndl057rC3s4+59RVxlSEiMm0k7k/i1qPXEqh7SEQEEhgEs3V1sYjIaxxXEJhZvZn9VlTFTIVWXVQmIvIa4waBmT1sZjVm1gBsAG41s69EX1o0qkoz1JRldFGZiEiokBZBrbt3Au8DbnX3FcBboy0rWrqoTETkVYUEQSa88Otq4L6I65kSrbVlahGIiIQKCYIvAg8CL7j742Z2MrAl2rKiNbuunJ0aLBYRAQq4jsDdfwD8YMT2VuD9URYVtbn1FXT0DNLVN0h1WTbuckREYlXIYPHfhYPFWTNbY2b7zewjU1FcVOY3BheSvXKwJ+ZKRETiV0jX0NvDweJ3AzuAU4HPRVpVxNoawiA4oCAQESkkCIb7Ti4Hbnf3gxHWMyXa1CIQETmqkLmG7g3XEegF/sTMmoEZfe5lTVmW+oos2xQEIiLjtwjcfRVwHrDS3QeBboLlJme0toYKdQ2JiFBAi8DMssBHgYvCKaV/BvxbxHVFrq2xkg3bO+IuQ0QkdoWMEdwErAD+NbydFe6b0eY3VLCzo5fBXD7uUkREYlXIGMHZ7j5yXYGfmtmGqAqaKm0NFeTyzu6OvqODxyIiSVRIiyBnZqcMb4RXFueiK2lqDH/5bzvYHXMlIiLxKqRF8DngITPbChgwH/hYpFVNgeGLyrYd6OHNi2MuRkQkRoVMMbHGzBYDpxEEwWZgedSFRW1WdRklmRTbdQqpiCRcQWsWu3s/sHF428x+ALRFVdRUSKWMefXlbNMppCKScBNdqtImtYqYtDVU6KIyEUm8iQaBT2oVMZnfWMn2gz24F8XhiIhMyJhdQ2Z2L6N/4RvQGFlFU6itoYIj/UMc7B6gsao07nJERGJxrDGCL0/wdzPG0VlID/YoCEQkscYMAnf/2VQWEoeR6xKc2VYfczUiIvGY6BhBUZjX8Oq1BCIiSRV5EJhZ2syeNLP7wu2FZvaYmW0xs/8ws5KoaxhLWTbNrJpSrUsgIol2XEFgZikzqznO9/gU8OyI7S8B/+jui4FDwHXH+XqTan5DpaajFpFEK2TN4u+GaxZXApuA58ysoKUqzWwu8C7ga+G2AZcCd4QPuQ24ciKFT5Z5DRVqEYhIohXSIlgarll8JfAjgiuKP1rg698IfB4Ynuu5Eehw96FwewcwZ7Qnmtn1ZrbWzNa2t7cX+HbHb0FjBXs6++gZGBr/wSIiRaigNYvDxWmuBO4OVykb9wosM3s3sM/d143cPcpDR30td7/Z3Ve6+8rm5uYCypyYRS1VAGxt1yykIpJMhQTBvwMvA5XAI2Y2H+gs4HkXAO81s5eB7xF0Cd0I1JnZ8Gmrc4Fdx1nzpFo8KwiCLfu64ixDRCQ2haxZ/E/uPsfdL/fANuCSAp73l+4+190XAB8Cfuru1wAPAR8IH3YtcPfEyz9x8xsryaSMLXuPxFmGiEhsChksnmVmt5jZ/eH2UoIv8In6C+CzZvYCwZjBLSfwWicsm06xsKmSLfsUBCKSTIV0DX0DeBCYHW4/D3z6eN7E3R9293eHP29193PcfZG7XxVOcR2rxbOq2LJXXUMikkyFBEGTu3+f8Myf8IyfGb9U5UiLWqp55WAPfYNFdVgiIgUpJAi6zayR8OweMzsXOBxpVVNscUsVedeZQyKSTIWsUPZZ4B7gFDP7BdDMq4O9RWHkmUNLZx/vhdMiIjPbMYPAzFJAGfDbvLpm8XPhtQRFY2FTJemU8YIGjEUkgY4ZBO6eN7N/cPfzgGemqKYpV5pJM7+xQqeQikgiFTJG8GMze384T1DRWtxSpYvKRCSRCh0jqASGzKyPoHvI3b2oOtMXt1Tzk2f3MTCUpyST6GUaRCRhCrmyuNrdU+5e4u414XZRhQAEA8a5vPPSfp05JCLJUkiLADOrBxYTDBwD4O6PRFVUHIYnn9uyr4vTTqqOuRoRkakzbhCY2e8TLC4zF1gPnAs8SjCJXNE4
pbkKMzRgLCKJU0hn+KeAs4Ft7n4JcCYQ3QIBMSnLpmlrqNAppCKSOIUEQZ+79wGYWam7bya4pqDo6MwhEUmiQoJgh5nVAXcBq83sbmJeQyAqi1qqeWl/N4O5/PgPFhEpEuOOEbj774Q//q2ZPQTUAg9EWlVMlpxUzWDO2bL3iKaaEJHEKGSwuG3E5kvh/UnAK5FUFKNl8+oA2LCjQ0EgIolRyOmj/0Uw86gRnD66EHgOOD3CumKxoLGC2vIsG7Z38OFz2sZ/gohIESika+iMkdtmdhbwh5FVFCMzY9m8OtZv74i7FBGRKXPccym4+xMEp5MWpeXz6nh+bxfd/UNxlyIiMiUKGSP47IjNFHAWRXgdwbDl82rJOzy18zDnntwYdzkiIpErpEVQPeJWSjBmcEWURcVp2dxwwFjdQyKSEIWMEXxhKgqZLhqrSpnXUM6GHQoCEUmGQrqG7jnW7939vZNXzvSwfF49614+GHcZIiJTopDTR18iuG7g2+H2h4GXgQcjqil2y+bWcu+GXezr7KOlpmz8J4iIzGCFBMGZ7n7RiO17zewRd/+rqIqK25ltwTjB+u0dvP30k2KuRkQkWoUMFjeb2cnDG2a2EGiOrqT4nT67lkzKNE4gIolQSIvgM8DDZrY13F5AAReUmVkZ8AjBmUYZ4A53/5swSL4HNABPAB9194EJ1B6ZsmyaJa3VurBMRBKhkLOGHjCzxcCScNdmd+8v4LX7gUvd/YiZZYGfm9n9BGsg/6O7f8/M/g24DrhpgvVHZtncOu5Zv4t83kmlLO5yREQiM27XkJldBZS4+wbgPcDt4TQTx+SB4VVesuHNCVY2uyPcfxtw5UQKj9ryeXV09Q+xdb8WqhGR4lbIGMH/dPcuM7sQeAfBl3dBf8GbWdrM1gP7gNXAi0CHuw/P37ADmHP8ZUfvrPn1AKzbdijmSkREolVIEOTC+3cBN7n73UBJIS/u7jl3X06w3vE5wBtGe9hozzWz681srZmtbW+f+hktTm6qpK4iqyAQkaJXSBDsNLN/B64GfmRmpQU+7yh37wAeJlj4vs7Mhscm5jLGamfufrO7r3T3lc3NU3+Skpmxoq1eQSAiRa+QL/SrCS4euyz8Qm8APjfek8ysOVziEjMrB94KPAs8BHwgfNi1wN0TqHtKnDW/nhfbuznUPa1OahIRmVTjBoG797j7D919S7i9291/XMBrtwIPmdlG4HFgtbvfB/wF8FkzewFoBG6ZePnRWhGOEzy5Xa0CESlehVxHMCHuvhE4c5T9WwnGC6a9ZXPrSKeMddsOcemSWXGXIyISieNemCZJykvSnD67RuMEIlLUCrmO4EuF7CtWZ7XVs2H7YQZz+bhLERGJRCEtgreNsu+dk13IdLVifj29gzk27+6KuxQRkUiMGQRm9sdm9hRwmpltHHF7Cdg4dSXGa8XRC8u0PoGIFKdjDRZ/F7gf+D/AqhH7u9w9Md+Ks+vKaa0tY90rHfzeBXFXIyIy+cZsEbj7YXd/GfgfwB533wYsBD4yfH1AUpw1v54nNGAsIkWqkDGCO4GcmS0iOOd/IUFrITFWtNWzs6OX3Yd74y5FRGTSFRIE+XCSuPcBN7r7ZwguFkuMcxY2APDLFw7EXImIyOQrJAgGzezDwO8C94X7stGVNP0sba2hqaqUh5+f+snvRESiVkgQfAw4D/jf7v5SuMLYt8d5TlFJpYyLT2vmkefbyeVHnSxVRGTGKmSuoU3u/kl3vz3cfsndb4i+tOnl4tOaOdw7yHrNOyQiRaaQK4sXm9kdZrbJzLYO36aiuOnkzYuaSRk8/Jy6h0SkuBTSNXQrwYpkQ8AlwDeBb0VZ1HRUW5Flxfx6HnpuX9yliIhMqkKCoNzd1wDm7tvc/W8J1h1OnItPa+HpnZ3s6+qLuxQRkUlTSBD0mVkK2GJmnzCz3wFaIq5rWrr4tGCltJ+pe0hEikghQfBpoAL4JLAC+AjBymKJs7S1hpZqnUYqIsVl3IVp
3P1xADNzd/9Y9CVNX2bGb5/azIPP7GEolyeT1nIOIjLzFXLW0HlmtolgvWHMbJmZ/WvklU1TlyxpobNvSIvViEjRKORP2huBdwAHANx9A3BRlEVNZxed2kx5Ns1d63fGXYqIyKQoqG/D3be/blcuglpmhKrSDJef0cq9G3bTO5DYfwYRKSKFBMF2MzsfcDMrMbM/J+wmSqqrVs7lSP8Q9z+9O+5SREROWCFB8EfAx4E5wA5gOfAnURY13b1pYQNtDRX8YO2OuEsRETlhhcw1tN/dr3H3We7e4u4fIZiJNLHMjA+smMujWw+w/WBP3OWIiJyQiZ7/+NlJrWIGev+KuZjBHevUKhCRmW2iQWCTWsUMNKeunAsXNXHHuh3kNTW1iMxgEw0CffMBH1gxl50dvfzyRa1cJiIz15hBYGZdZtY5yq0LmD3eC5vZPDN7yMyeNbNnzOxT4f4GM1ttZlvC+/pJPJ4p9Y7TT6KuIst3HtsWdykiIhM2ZhC4e7W714xyq3b3caemIJi2+s/c/Q3AucDHzWwpsApY4+6LgTXh9oxUlk3zwZXz+PGmvVrYXkRmrMgmy3H33e7+RPhzF8G1B3OAK4DbwofdBlwZVQ1T4Zo3zSfvzu2PvRJ3KSIiEzIls6aZ2QLgTOAxYJa774YgLBhjSmszu97M1prZ2vb26TvbZ1tjBRef2sx3f72dgaF83OWIiBy3yIPAzKqAO4FPu3tnoc9z95vdfaW7r2xubo6uwEnwu+ctYP+Rfh58Zk/cpYiIHLdIg8DMsgQh8B13/2G4e6+ZtYa/bwVm/NqPv31qM20NFXzrUQ0ai8jME1kQmJkBtwDPuvtXRvzqHl5d2OZa4O6oapgqqZTxkXPb+PXLB9m8p+BGj4jItBBli+AC4KPApWa2PrxdDtwAvM3MtgBvC7dnvKtXzqM8m+arP9kSdykiIselkNNAJ8Tdf87YVyC/Jar3jUtdRQl/fPEpfGX18zy29QBvOrkx7pJERAqitRYn0R+8+WRm15bxxfs2kdO0EyIyQygIJlF5SZpVl7+BZ3Z1cqcmoxORGUJBMMne81utnNVWx989+BxH+ofiLkdEZFwKgklmZvyv95zO/iP9/PNPX4i7HBGRcSkIIrB8Xh3vP2sut/x8K1vbj8RdjojIMSkIIvIX7zyN0kyaL9y7CXcNHIvI9KUgiEhLdRmffutifvZ8Oz95dsZfPC0iRUxBEKFrz1/A4pYqvnjfM/QN5uIuR0RkVAqCCGXTKb7w3tPZfrCXr67RFcciMj0pCCJ2/qImPnT2PG56+EUtdC8i01JkU0zIq754xRvZfqiHVXduZHZtGecvaoq7JBGRo9QimAIlmRT/es0KTm6u5A+/vY4te7viLklE5CgFwRSpLc/y9d87m7Jsmj/45lo6+wbjLklEBFAQTKm59RXcdM1ZbD/Uy+d/sFHXF4jItKAgmGIrFzSw6rIlPPDMHm75+UtxlyMioiCIw++/eSFvXzqLG+7fzNqXD8ZdjogknIIgBmbG31+1jDn15Xzy9ic53KvxAhGJj4IgJrXlWW784HL2dvXzhXueibscEUkwBUGMzmyr5xOXLOKHT+7kvzbujrscEUkoBUHMPnHpIpbNreWv73qKfZ19cZcjIgmkIIhZNp3iKx9cTt9gjj/45lo27eqMuyQRSRgFwTRwSnMVX7l6OS8f6OFd//f/8fk7NrBXrQMRmSIKgmni8jNaeeRzl/D7Fy7krid3cemXH+bbv9pGPq+LzkQkWgqCaaS2Istfv2spqz97EWe21fM/7nqaa772GK8c6Im7NBEpYgqCaWh+YyXfuu4cbnjfGTy98zDvuPERbv3FS2odiEgkIgsCM/u6me0zs6dH7Gsws9VmtiW8r4/q/Wc6M+ND57Tx4Gcu4k0nN/CFezdx9b8/ytb2I3GXJiJFJsoWwTeAy163bxWwxt0XA2vCbTmG2XXl3Pp7Z/MPVy3j+b1dvOPGR1h150Ze2t8dd2kiUiQs
yhkwzWwBcJ+7vzHcfg642N13m1kr8LC7nzbe66xcudLXrl0bWZ0zxb7OPv75oRf43uPbGcrlecfpJ3HZG0/iwkVNNFaVxl2eiEwzZrbO3VeO+7gpDoIOd68b8ftD7j5u95CC4LXau/r5+i9e4nu/foVDPYOYwW/NqeW9y+dw5fLZCgURAYogCMzseuB6gLa2thXbtm2LrM6ZKpd3nt55mJ89387qTXt5audhMinjkiUtvGfZbC5d0kJVqVYjFUmq6RoE6hqK0HN7urjziR3c9eRO9nX1U5JJcdHiJt67fA5vXzqLsmw67hJFZAoVGgRT/efiPcC1wA3h/d1T/P5F7bSTqvmry9/AqsuWsO6VQ9z/1B7uf3o3P3l2H9WlGd55xkm884xWzju5UaEgIkdF1iIws9uBi4EmYC/wN8BdwPeBNuAV4Cp3H3dlFrUIJi6fd3619QB3PrGTB57eTfdAjoqSNBcuauLsBQ0saa1myUk1NFdrXEGk2EyLrqHJoiCYHH2DOR7deoA1z+7loc3t7OzoPfq7uoosC5sqWdhUyRlzajn/lCZOnVWFmcVYsYicCAWBjOtg9wCb93SyeXcXL7YfYWt7Ny+2H2FfVz8ATVUlvG3pSVx34UIWtVTFXK2IHC8FgUzY9oM9PLr1AD/fsp8Hn9lD/1CeS5e08J5lrcyqKaOlupTm6jJqyjJqMYhMY9N1sFhmgHkNFcxrqODqlfM4cKSfb/1qG996dBs/3bzvNY+rLEkzu66cOfXlLGisZH5jBQuaKjmlqYo59eWkUwoJkZlALQIpSP9Qju0He9jX1U97Vz97O/vY1dHH7sO9bD/Yy7YD3XQP5I4+viSTYk5dOZmUYQaZVIpZNaW01pUzr76Ci05tYmlrjVoUIhFSi0AmVWkmzaKWaha1VI/6e3dn/5EBXj7Qzdb2I7zY3s3Ojl7cnXweBnJ59hzuY/32Dg71DPKlB2B+YwVve8MsGqtKSRmkzMimjZJMOrxPkUmlyKSNtBnp8H4on6d3IE/vYHAG1Oy6cubUldNQWaJWiMgEKAhkUpgZzdWlNFeXcvaChmM+dv+RflZv2suPntrNN375MkOTOL12JmWUZlKkU4Y75N1JmVFWkqY8m6aiJE1NWZaa8gwVJRlyeWconyfvUJZNU55NUVGSoaGyhMaqEhorS6goyYS/S1NXkaWhsoSKkrRaM1I0FAQy5ZqqSvnwOW18+Jw2BnN5cnkn704u7wzmnIGhPANDeQbzeYZyzmAuf/T3ubyTTacoL0lTlklzpH+InR297DzUQ0fv4NHnDuWDAEgZ5NzpG8zTN5jjSP8QXX2D7Oroo2dgiHTKyKaDSXj7h/L0DuTo7h+iq3/omMdQkklRU5ahuixLdVnwv9HAUJ6BXJ5sKkVlaZrK0gzplJH34HoOsyAwUwalmRS15VlqyrKUl6SPhlYu7/QN5ugdzJHLQ3VZhuqyDLXlWZqrS5lVU0ZTVQlmhrvjHqx7nc2kyKaNVBhOBtSUZ48em8ixKAgkVtl0ihO9yHnp7JrJKWaE/qEcB7sHOHBkgL7BHH2DeXoGhujoHeRQ9wAHuwfo7AtC5Uj/EMbwsaQYzOXpGcjR1TeEux/98gfIe9CN1juYo7N3iMO9g/QN5UiZYUA6ZZSHrZeU2dHXn2ijqbY8S2NVCfUVJdSVZ6mtyJJJGbk85MKW0PBLG0GLKhWGY2kmuGXCgDE4+ruSdIpUysjlg9B1D04eqCzNvCbY3CGTNjKpIKjMwD288epjKkvT1JZnqS3PUlGSoaIkTVk2TSp8fN49CFQP/mhIp4ySdGrKWmW9Aznau/rZ393P4FD+6P5MOvz3yKQoz6apKs1QVZahNDOzrtxXEIiMojSTprW2nNba8rhLwd3p7BuivauPvZ397D8SXOcx/Nf/UD4ftkbCb1iCCQkP9w5xsLuf/d0DdPQMsKezj817usjlgy/SdCoIKAu/5PPu5NzJ5ZyBnDMwlGMgl2cw58FYzzQ7
rySdMiqyaVIpYzAX/Bs4BONLYSi/epxhC4rgOEcygoAygtZj0F0Y3OfDn/tHfPkXanjcK5UyMmEdw/fBz6mj+8yCPxKGW73DXZaDOeeuj1/AwqbKSfk3G4uCQGSaM7Ojfy2PNVg/VfJ5ZzD8gsrlPBjID5s7PWG3Wu9g7jUBk8s7A7mgm2+Y2atBNvzczt5BDvcO0jMQvEbPQA734HFmwRe/GeEJA07PwBA9Azlyeack/KvcjKPdi8NdikO5IMSC93z1i3/Y0dYLvOaLOp0KT1JIGbUVWZqrSmmqKqU0E3S3+fCxhV2CvQNB1+OR/iH6h/JheAZBMhwoQ7kgbPNhN+jI36dSRtrCVld4kkQ2HXQzRk1BICIFS6WM0lSa0WY3L8umaagsmfqi5IRpJElEJOEUBCIiCacgEBFJOAWBiEjCKQhERBJOQSAiknAKAhGRhFMQiIgk3IxYj8DM2oFtE3x6E7B/EsuZKZJ43Ek8ZkjmceuYCzPf3ZvHe9CMCIITYWZrC1mYodgk8biTeMyQzOPWMU8udQ2JiCScgkBEJOGSEAQ3x11ATJJ43Ek8ZkjmceuYJ1HRjxGIiMixJaFFICIix1DUQWBml5nZc2b2gpmtirueKJjZPDN7yMyeNbNnzOxT4f4GM1ttZlvC+/q4a51sZpY2syfN7L5we6GZPRYe83+YWdFNjm9mdWZ2h5ltDj/z84r9szazz4T/bT9tZrebWVkxftZm9nUz22dmT4/YN+pna4F/Cr/bNprZWSfy3kUbBGaWBv4FeCewFPiwmS2Nt6pIDAF/5u5vAM4FPh4e5ypgjbsvBtaE28XmU8CzI7a/BPxjeMyHgOtiqSpaXwUecPclwDKC4y/az9rM5gCfBFa6+xuBNPAhivOz/gZw2ev2jfXZvhNYHN6uB246kTcu2iAAzgFecPet7j4AfA+4IuaaJp2773b3J8Kfuwi+GOYQHOtt4cNuA66Mp8JomNlc4F3A18JtAy4F7ggfUozHXANcBNwC4O4D7t5BkX/WBCsplptZBqgAdlOEn7W7PwIcfN3usT7bK4BveuBXQJ2ZtU70vYs5COYA20ds7wj3FS0zWwCcCTwGzHL33RCEBdASX2WRuBH4PDC8qngj0OHuQ+F2MX5cslOfAAAEJUlEQVTeJwPtwK1hl9jXzKySIv6s3X0n8GXgFYIAOAyso/g/62FjfbaT+v1WzEFgo+wr2lOkzKwKuBP4tLt3xl1PlMzs3cA+d183cvcoDy22zzsDnAXc5O5nAt0UUTfQaMI+8SuAhcBsoJKgW+T1iu2zHs+k/vdezEGwA5g3YnsusCumWiJlZlmCEPiOu/8w3L13uKkY3u+Lq74IXAC818xeJujyu5SghVAXdh9AcX7eO4Ad7v5YuH0HQTAU82f9VuAld29390Hgh8D5FP9nPWysz3ZSv9+KOQgeBxaHZxeUEAww3RNzTZMu7Bu/BXjW3b8y4lf3ANeGP18L3D3VtUXF3f/S3ee6+wKCz/Wn7n4N8BDwgfBhRXXMAO6+B9huZqeFu94CbKKIP2uCLqFzzawi/G99+JiL+rMeYazP9h7gd8Ozh84FDg93IU2IuxftDbgceB54EfjruOuJ6BgvJGgSbgTWh7fLCfrM1wBbwvuGuGuN6PgvBu4Lfz4Z+DXwAvADoDTu+iI43uXA2vDzvguoL/bPGvgCsBl4GvgWUFqMnzVwO8E4yCDBX/zXjfXZEnQN/Uv43fYUwVlVE35vXVksIpJwxdw1JCIiBVAQiIgknIJARCThFAQiIgmnIBARSTgFgRQ9MzsS3i8ws/82ya/9V6/b/uVkvr7IVFAQSJIsAI4rCMJZbI/lNUHg7ucfZ00isVMQSJLcALzZzNaHc9ynzezvzezxcE73PwQws4vDNR6+S3CxDmZ2l5mtC+fFvz7cdwPBrJjrzew74b7h1oeFr/20mT1lZh8c8doPj1hT4DvhFbOY2Q1mtims5ctT/q8jiZUZ/yEiRWMV8Ofu
/m6A8Av9sLufbWalwC/M7MfhY88B3ujuL4Xb/93dD5pZOfC4md3p7qvM7BPuvnyU93ofwVXAy4Cm8DmPhL87EzidYG6YXwAXmNkm4HeAJe7uZlY36UcvMga1CCTJ3k4wX8t6gqm7GwkW+gD49YgQAPikmW0AfkUw2ddiju1C4HZ3z7n7XuBnwNkjXnuHu+cJpgRZAHQCfcDXzOx9QM8JH51IgRQEkmQG/Km7Lw9vC919uEXQffRBZhcTzIJ5nrsvA54Eygp47bH0j/g5B2Q8mFv/HIJZZK8EHjiuIxE5AQoCSZIuoHrE9oPAH4fTeGNmp4YLvbxeLXDI3XvMbAnBkqDDBoef/zqPAB8MxyGaCVYW+/VYhYXrSdS6+4+ATxN0K4lMCY0RSJJsBIbCLp5vEKz/uwB4IhywbWf0JQ8fAP7IzDYCzxF0Dw27GdhoZk94MBX2sP8EzgM2EMwO+3l33xMGyWiqgbvNrIygNfGZiR2iyPHT7KMiIgmnriERkYRTEIiIJJyCQEQk4RQEIiIJpyAQEUk4BYGISMIpCEREEk5BICKScP8fLRxq/DFspOQAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<matplotlib.figure.Figure at 0x21a4eeaf390>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(test_score)\n",
    "plt.xlabel('Iterations')\n",
    "plt.ylabel('Least squares Loss')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 5. VotingClassifier\n",
    "* The core concept of VotingClassifier is to combine conceptually different machine learning classifiers and use a majority vote or weighted vote to predict the class labels.\n",
    "* A voting classifier is most effective when the individual estimators are already good, since it can offset their individual weaknesses. Ensemble methods can themselves be used as members.\n",
    "* Types of voting\n",
    "  - Hard voting: each estimator votes for a class label & the majority label wins.\n",
    "  - Soft voting: the predicted class probabilities of the estimators are averaged & the class with the highest average wins. Estimator weights can be configured in either mode."
   ]
  },
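  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The difference between hard and soft voting is easiest to see on a single sample. The probabilities and weights below are made-up numbers chosen for illustration, not outputs of any fitted model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical predicted probabilities for ONE sample from three classifiers,\n",
    "# over the classes [0, 1]\n",
    "probas = np.array([\n",
    "    [0.55, 0.45],  # classifier A: weakly prefers class 0\n",
    "    [0.60, 0.40],  # classifier B: weakly prefers class 0\n",
    "    [0.05, 0.95],  # classifier C: strongly prefers class 1\n",
    "])\n",
    "\n",
    "# Hard voting: each classifier casts one vote for its most likely class\n",
    "votes = probas.argmax(axis=1)\n",
    "hard_winner = np.bincount(votes, minlength=2).argmax()\n",
    "\n",
    "# Soft voting: average the probabilities, then pick the most likely class\n",
    "soft_winner = probas.mean(axis=0).argmax()\n",
    "\n",
    "# Weights scale each classifier's contribution to the soft average\n",
    "weights = np.array([4, 4, 1])\n",
    "weighted_soft_winner = np.average(probas, axis=0, weights=weights).argmax()\n",
    "\n",
    "print(hard_winner, soft_winner, weighted_soft_winner)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Hard voting picks class 0 (two votes to one), soft voting picks class 1 because classifier C is far more confident, and down-weighting C flips the soft result back to class 0."
   ]
  },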
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Problem : Digit identification using VotingClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.ensemble import VotingClassifier,RandomForestClassifier,AdaBoostClassifier\n",
    "from sklearn.svm import SVC\n",
    "from sklearn.linear_model import LogisticRegression\n",
    "from sklearn.tree import DecisionTreeClassifier\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.datasets import load_digits\n",
    "from sklearn.model_selection import train_test_split"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {},
   "outputs": [],
   "source": [
    "estimators = [ \n",
    "    ('rf',RandomForestClassifier(n_estimators=20)),\n",
    "    ('svc',SVC(kernel='rbf', probability=True)),\n",
    "    ('knc',KNeighborsClassifier()),\n",
    "    ('abc',AdaBoostClassifier(base_estimator=DecisionTreeClassifier() ,n_estimators=20)),\n",
    "    ('lr',LogisticRegression()) \n",
    "]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [],
   "source": [
    "vc = VotingClassifier(estimators=estimators, voting='hard')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 106,
   "metadata": {},
   "outputs": [],
   "source": [
    "digits = load_digits()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 107,
   "metadata": {},
   "outputs": [],
   "source": [
    "X,y = digits.data, digits.target"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 108,
   "metadata": {},
   "outputs": [],
   "source": [
    "trainX, testX, trainY, testY = train_test_split(X,y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\svm\\base.py:196: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.\n",
      "  \"avoid this warning.\", FutureWarning)\n",
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n",
      "  FutureWarning)\n",
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\logistic.py:459: FutureWarning: Default multi_class will be changed to 'auto' in 0.22. Specify the multi_class option to silence this warning.\n",
      "  \"this warning.\", FutureWarning)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "VotingClassifier(estimators=[('rf', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
       "            max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_we...penalty='l2', random_state=None, solver='warn',\n",
       "          tol=0.0001, verbose=0, warm_start=False))],\n",
       "         flatten_transform=None, n_jobs=None, voting='hard', weights=None)"
      ]
     },
     "execution_count": 109,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vc.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9711111111111111"
      ]
     },
     "execution_count": 110,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vc.score(testX,testY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "rf 0.9666666666666667\n",
      "svc 0.4822222222222222\n",
      "knc 0.9822222222222222\n",
      "abc 0.8688888888888889\n",
      "lr 0.9555555555555556\n"
     ]
    }
   ],
   "source": [
    "# Score each fitted estimator on the test set to compare it with the ensemble\n",
    "for (name, _), est in zip(vc.estimators, vc.estimators_):\n",
    "    print(name, est.score(testX, testY))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "metadata": {},
   "outputs": [],
   "source": [
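    "# Soft voting with per-estimator weights; SVC scored poorly on its own above, so it gets a small weight\n",
    "vc = VotingClassifier(estimators=estimators, voting='soft', weights=[2, 0.1, 3, 2, 2])"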
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\svm\\base.py:196: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.\n",
      "  \"avoid this warning.\", FutureWarning)\n",
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\logistic.py:432: FutureWarning: Default solver will be changed to 'lbfgs' in 0.22. Specify a solver to silence this warning.\n",
      "  FutureWarning)\n",
      "C:\\Users\\awant\\Anaconda3\\lib\\site-packages\\sklearn\\linear_model\\logistic.py:459: FutureWarning: Default multi_class will be changed to 'auto' in 0.22. Specify the multi_class option to silence this warning.\n",
      "  \"this warning.\", FutureWarning)\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "VotingClassifier(estimators=[('rf', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
       "            max_depth=None, max_features='auto', max_leaf_nodes=None,\n",
       "            min_impurity_decrease=0.0, min_impurity_split=None,\n",
       "            min_samples_leaf=1, min_samples_split=2,\n",
       "            min_we...penalty='l2', random_state=None, solver='warn',\n",
       "          tol=0.0001, verbose=0, warm_start=False))],\n",
       "         flatten_transform=None, n_jobs=None, voting='soft',\n",
       "         weights=[2, 0.1, 3, 2, 2])"
      ]
     },
     "execution_count": 119,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vc.fit(trainX,trainY)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9777777777777777"
      ]
     },
     "execution_count": 120,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vc.score(testX,testY)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
